Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 1042774 |
| Missing cells | 181352 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 127.3 MiB |
| Average record size in memory | 128.0 B |
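The derived rows in this table follow from the raw counts. As a minimal sketch (variable names are illustrative, and the assumption that 1 MiB = 1024² bytes is ours), the arithmetic can be checked like this:

```python
# Values copied from the "Dataset statistics" table above.
n_variables = 16
n_observations = 1_042_774
missing_cells = 181_352
total_size_mib = 127.3

# Missing cells (%) is the share of all cells (rows x columns) that are empty.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 MiB = 1024**2 bytes.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

# The report's alerts list four columns with 45338 missing values each,
# which accounts for every missing cell.
missing_per_column = 45_338
columns_with_missing = 4

print(f"{missing_pct:.1f}%")          # 1.1%
print(f"{avg_record_bytes:.1f} B")    # 128.0 B
print(columns_with_missing * missing_per_column == missing_cells)  # True
```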
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 4 |
| time has a high cardinality: 42149 distinct values | High cardinality |
| gameId is highly correlated with team | High correlation |
| frameId is highly correlated with s and 1 other fields | High correlation |
| s is highly correlated with dis | High correlation |
| dis is highly correlated with s | High correlation |
| team is highly correlated with gameId | High correlation |
| nflId has 45338 (4.3%) missing values | Missing |
| jerseyNumber has 45338 (4.3%) missing values | Missing |
| o has 45338 (4.3%) missing values | Missing |
| dir has 45338 (4.3%) missing values | Missing |
| s has 64324 (6.2%) zeros | Zeros |
| a has 60212 (5.8%) zeros | Zeros |
| dis has 64432 (6.2%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-02 15:03:36.215598 |
|---|---|
| Analysis finished | 2022-11-02 15:05:09.792761 |
| Duration | 1 minute and 33.58 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
gameId
Real number (ℝ≥0)
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2021091890 |
| Minimum | 2021091600 |
|---|---|
| Maximum | 2021092000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 2021091600 |
|---|---|
| 5-th percentile | 2021091600 |
| Q1 | 2021091903 |
| median | 2021091906 |
| Q3 | 2021091910 |
| 95-th percentile | 2021092000 |
| Maximum | 2021092000 |
| Range | 400 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 83.08845928 |
|---|---|
| Coefficient of variation (CV) | 4.111067868 × 10⁻⁸ |
| Kurtosis | 7.716409658 |
| Mean | 2021091890 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -2.87491269 |
| Sum | 2.107542075 × 10¹⁵ |
| Variance | 6903.692065 |
| Monotonicity | Increasing |
| Value | Count | Frequency (%) |
| 2021091905 | 75923 | 7.3% |
| 2021091910 | 74520 | 7.1% |
| 2021091912 | 73784 | 7.1% |
| 2021091600 | 73761 | 7.1% |
| 2021091904 | 71346 | 6.8% |
| 2021091911 | 68310 | 6.6% |
| 2021091903 | 67850 | 6.5% |
| 2021091906 | 66539 | 6.4% |
| 2021091900 | 65780 | 6.3% |
| 2021091908 | 65504 | 6.3% |
| Other values (6) | 339457 |
| Value | Count | Frequency (%) |
| 2021091600 | 73761 | |
| 2021091900 | 65780 | |
| 2021091901 | 62100 | |
| 2021091902 | 51819 | |
| 2021091903 | 67850 | |
| 2021091904 | 71346 | |
| 2021091905 | 75923 | |
| 2021091906 | 66539 | |
| 2021091907 | 51451 | |
| 2021091908 | 65504 |
| Value | Count | Frequency (%) |
| 2021092000 | 60559 | |
| 2021091913 | 54533 | |
| 2021091912 | 73784 | |
| 2021091911 | 68310 | |
| 2021091910 | 74520 | |
| 2021091909 | 58995 | |
| 2021091908 | 65504 | |
| 2021091907 | 51451 | |
| 2021091906 | 66539 | |
| 2021091905 | 75923 |
playId
Real number (ℝ≥0)
| Distinct | 943 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2060.790242 |
| Minimum | 54 |
|---|---|
| Maximum | 4574 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 54 |
|---|---|
| 5-th percentile | 238 |
| Q1 | 1091 |
| median | 2060 |
| Q3 | 3049 |
| 95-th percentile | 3836 |
| Maximum | 4574 |
| Range | 4520 |
| Interquartile range (IQR) | 1958 |
Descriptive statistics
| Standard deviation | 1162.202997 |
|---|---|
| Coefficient of variation (CV) | 0.5639598701 |
| Kurtosis | -1.098271353 |
| Mean | 2060.790242 |
| Median Absolute Deviation (MAD) | 979 |
| Skewness | 0.03410623412 |
| Sum | 2148938484 |
| Variance | 1350715.807 |
| Monotonicity | Not monotonic |
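Several of the playId rows are derived from others. The sketch below recomputes them from the reported values; the names are illustrative, and small differences are expected because the inputs are rounded.

```python
# Values copied from the playId tables above.
mean = 2060.790242
std = 1162.202997
minimum, maximum = 54, 4574
q1, q3 = 1091, 3049
n_observations = 1_042_774  # playId has no missing values

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance
total = mean * n_observations     # Sum, up to rounding of the mean

print(value_range)         # 4520
print(iqr)                 # 1958
print(round(cv, 4))        # 0.564
print(round(variance, 1))  # 1350715.8
```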
| Value | Count | Frequency (%) |
| 1143 | 3933 | 0.4% |
| 2010 | 3910 | 0.4% |
| 76 | 3312 | 0.3% |
| 2531 | 3197 | 0.3% |
| 523 | 3151 | 0.3% |
| 3336 | 3105 | 0.3% |
| 2098 | 3013 | 0.3% |
| 1671 | 2967 | 0.3% |
| 1568 | 2898 | 0.3% |
| 3729 | 2875 | 0.3% |
| Other values (933) | 1010413 |
| Value | Count | Frequency (%) |
| 54 | 1035 | 0.1% |
| 55 | 1035 | 0.1% |
| 59 | 1610 | |
| 62 | 1311 | 0.1% |
| 65 | 874 | 0.1% |
| 75 | 989 | 0.1% |
| 76 | 3312 | |
| 77 | 966 | 0.1% |
| 82 | 644 | 0.1% |
| 95 | 989 | 0.1% |
| Value | Count | Frequency (%) |
| 4574 | 1403 | |
| 4552 | 851 | |
| 4530 | 989 | |
| 4518 | 644 | |
| 4489 | 1587 | |
| 4445 | 966 | |
| 4431 | 736 | |
| 4409 | 828 | |
| 4380 | 874 | |
| 4345 | 874 |
nflId
Real number (ℝ≥0)
| Distinct | 1156 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 45338 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45570.49178 |
| Minimum | 25511 |
|---|---|
| Maximum | 53957 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 25511 |
|---|---|
| 5-th percentile | 37130 |
| Q1 | 42410 |
| median | 45011 |
| Q3 | 47971 |
| 95-th percentile | 53465 |
| Maximum | 53957 |
| Range | 28446 |
| Interquartile range (IQR) | 5561 |
Descriptive statistics
| Standard deviation | 5013.50095 |
|---|---|
| Coefficient of variation (CV) | 0.110016389 |
| Kurtosis | -0.03850193068 |
| Mean | 45570.49178 |
| Median Absolute Deviation (MAD) | 2842 |
| Skewness | -0.1623338934 |
| Sum | 4.545364904 × 10¹⁰ |
| Variance | 25135191.78 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 40089 | 1972 | 0.2% |
| 53442 | 1972 | 0.2% |
| 41390 | 1972 | 0.2% |
| 41959 | 1972 | 0.2% |
| 43478 | 1972 | 0.2% |
| 52459 | 1972 | 0.2% |
| 52414 | 1972 | 0.2% |
| 45630 | 1972 | 0.2% |
| 43533 | 1945 | 0.2% |
| 52426 | 1914 | 0.2% |
| Other values (1146) | 977801 | |
| (Missing) | 45338 | 4.3% |
| Value | Count | Frequency (%) |
| 25511 | 1377 | |
| 28963 | 1287 | |
| 29550 | 1458 | |
| 29851 | 1071 | |
| 30842 | 467 | < 0.1% |
| 30869 | 1325 | |
| 33084 | 1792 | |
| 33107 | 1399 | |
| 33131 | 650 | 0.1% |
| 33566 | 212 | < 0.1% |
| Value | Count | Frequency (%) |
| 53957 | 843 | |
| 53953 | 89 | < 0.1% |
| 53946 | 111 | < 0.1% |
| 53930 | 42 | < 0.1% |
| 53876 | 72 | < 0.1% |
| 53861 | 144 | < 0.1% |
| 53687 | 65 | < 0.1% |
| 53679 | 140 | < 0.1% |
| 53674 | 1335 | |
| 53668 | 253 | < 0.1% |
frameId
Real number (ℝ≥0)
| Distinct | 171 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.55677357 |
| Minimum | 1 |
|---|---|
| Maximum | 171 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 22 |
| Q3 | 33 |
| 95-th percentile | 51 |
| Maximum | 171 |
| Range | 170 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 16.39530126 |
|---|---|
| Coefficient of variation (CV) | 0.695990952 |
| Kurtosis | 7.788103265 |
| Mean | 23.55677357 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 1.647163031 |
| Sum | 24564391 |
| Variance | 268.8059035 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 24541 | 2.4% |
| 12 | 24541 | 2.4% |
| 19 | 24541 | 2.4% |
| 18 | 24541 | 2.4% |
| 17 | 24541 | 2.4% |
| 16 | 24541 | 2.4% |
| 15 | 24541 | 2.4% |
| 14 | 24541 | 2.4% |
| 13 | 24541 | 2.4% |
| 11 | 24541 | 2.4% |
| Other values (161) | 797364 |
| Value | Count | Frequency (%) |
| 1 | 24541 | |
| 2 | 24541 | |
| 3 | 24541 | |
| 4 | 24541 | |
| 5 | 24541 | |
| 6 | 24541 | |
| 7 | 24541 | |
| 8 | 24541 | |
| 9 | 24541 | |
| 10 | 24541 |
| Value | Count | Frequency (%) |
| 171 | 23 | |
| 170 | 46 | |
| 169 | 46 | |
| 168 | 46 | |
| 167 | 46 | |
| 166 | 46 | |
| 165 | 46 | |
| 164 | 46 | |
| 163 | 46 | |
| 162 | 46 |
time
Categorical
| Distinct | 42149 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.0 MiB |
| 2021-09-19T17:05:13.800 | 69 |
|---|---|
| 2021-09-19T17:22:54.400 | 69 |
| 2021-09-19T21:04:55.300 | 69 |
| 2021-09-19T18:02:59.100 | 69 |
| 2021-09-19T21:04:55.100 | 69 |
| Other values (42144) |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 23983802 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 2021-09-17T00:23:09.600 |
|---|---|
| 2nd row | 2021-09-17T00:23:09.700 |
| 3rd row | 2021-09-17T00:23:09.800 |
| 4th row | 2021-09-17T00:23:09.900 |
| 5th row | 2021-09-17T00:23:10.000 |
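The report treats time as a categorical string, but the sample rows are fixed-width ISO 8601 timestamps. As a hedged sketch using only the five sample values shown above (not the full column), they can be parsed with the standard library to confirm the fixed length and inspect the spacing between consecutive rows:

```python
from datetime import datetime, timedelta

# The five sample rows listed above.
sample = [
    "2021-09-17T00:23:09.600",
    "2021-09-17T00:23:09.700",
    "2021-09-17T00:23:09.800",
    "2021-09-17T00:23:09.900",
    "2021-09-17T00:23:10.000",
]

# Every value has the reported length of 23 characters.
lengths = {len(value) for value in sample}

# Parse and compute the gap between consecutive sample rows.
parsed = [datetime.fromisoformat(value) for value in sample]
steps = [later - earlier for earlier, later in zip(parsed, parsed[1:])]

# With a constant length, total characters = length x number of rows.
total_characters = 23 * 1_042_774

print(lengths)           # {23}
print(set(steps))        # {datetime.timedelta(microseconds=100000)}
print(total_characters)  # 23983802
```

In these five rows the spacing is 100 ms; whether that holds for the whole column is not something the report states.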
Common Values
| Value | Count | Frequency (%) |
| 2021-09-19T17:05:13.800 | 69 | < 0.1% |
| 2021-09-19T17:22:54.400 | 69 | < 0.1% |
| 2021-09-19T21:04:55.300 | 69 | < 0.1% |
| 2021-09-19T18:02:59.100 | 69 | < 0.1% |
| 2021-09-19T21:04:55.100 | 69 | < 0.1% |
| 2021-09-19T21:04:55.000 | 69 | < 0.1% |
| 2021-09-19T21:04:54.900 | 69 | < 0.1% |
| 2021-09-19T17:22:54.500 | 69 | < 0.1% |
| 2021-09-19T17:22:54.300 | 69 | < 0.1% |
| 2021-09-19T21:04:55.500 | 69 | < 0.1% |
| Other values (42139) | 1042084 |
Length
| Value | Count | Frequency (%) |
| 2021-09-19t17:05:13.800 | 69 | < 0.1% |
| 2021-09-19t21:04:55.200 | 69 | < 0.1% |
| 2021-09-19t19:43:26.300 | 69 | < 0.1% |
| 2021-09-19t17:49:04.700 | 69 | < 0.1% |
| 2021-09-19t17:49:04.600 | 69 | < 0.1% |
| 2021-09-19t17:49:04.500 | 69 | < 0.1% |
| 2021-09-19t17:49:04.400 | 69 | < 0.1% |
| 2021-09-19t19:43:26.900 | 69 | < 0.1% |
| 2021-09-19t19:43:26.800 | 69 | < 0.1% |
| 2021-09-19t19:43:26.700 | 69 | < 0.1% |
| Other values (42139) | 1042084 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5203794 | |
| 1 | 3393972 | |
| 2 | 3311122 | |
| 9 | 2410585 | |
| - | 2085548 | |
| : | 2085548 | |
| T | 1042774 | 4.3% |
| . | 1042774 | 4.3% |
| 3 | 733078 | 3.1% |
| 5 | 651498 | 2.7% |
| Other values (4) | 2023109 | 8.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 17727158 | |
| Other Punctuation | 3128322 | 13.0% |
| Dash Punctuation | 2085548 | 8.7% |
| Uppercase Letter | 1042774 | 4.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5203794 | |
| 1 | 3393972 | |
| 2 | 3311122 | |
| 9 | 2410585 | |
| 3 | 733078 | 4.1% |
| 5 | 651498 | 3.7% |
| 4 | 624890 | 3.5% |
| 7 | 591860 | 3.3% |
| 8 | 499769 | 2.8% |
| 6 | 306590 | 1.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 2085548 | |
| . | 1042774 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2085548 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 1042774 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 22941028 | |
| Latin | 1042774 | 4.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5203794 | |
| 1 | 3393972 | |
| 2 | 3311122 | |
| 9 | 2410585 | |
| - | 2085548 | |
| : | 2085548 | |
| . | 1042774 | 4.5% |
| 3 | 733078 | 3.2% |
| 5 | 651498 | 2.8% |
| 4 | 624890 | 2.7% |
| Other values (3) | 1398219 | 6.1% |
Latin
| Value | Count | Frequency (%) |
| T | 1042774 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 23983802 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5203794 | |
| 1 | 3393972 | |
| 2 | 3311122 | |
| 9 | 2410585 | |
| - | 2085548 | |
| : | 2085548 | |
| T | 1042774 | 4.3% |
| . | 1042774 | 4.3% |
| 3 | 733078 | 3.1% |
| 5 | 651498 | 2.7% |
| Other values (4) | 2023109 | 8.4% |
jerseyNumber
Real number (ℝ≥0)
| Distinct | 99 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 45338 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49.57896045 |
| Minimum | 1 |
|---|---|
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 23 |
| median | 52 |
| Q3 | 75 |
| 95-th percentile | 96 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 52 |
Descriptive statistics
| Standard deviation | 29.93014313 |
|---|---|
| Coefficient of variation (CV) | 0.6036863794 |
| Kurtosis | -1.336541678 |
| Mean | 49.57896045 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.03295036995 |
| Sum | 49451840 |
| Variance | 895.8134678 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 21758 | 2.1% |
| 21 | 20602 | 2.0% |
| 26 | 20253 | 1.9% |
| 23 | 19989 | 1.9% |
| 76 | 19780 | 1.9% |
| 24 | 18824 | 1.8% |
| 11 | 18607 | 1.8% |
| 54 | 17228 | 1.7% |
| 91 | 16167 | 1.6% |
| 72 | 15557 | 1.5% |
| Other values (89) | 808671 | |
| (Missing) | 45338 | 4.3% |
| Value | Count | Frequency (%) |
| 1 | 11349 | |
| 2 | 21758 | |
| 3 | 6356 | 0.6% |
| 4 | 8654 | 0.8% |
| 5 | 6079 | 0.6% |
| 6 | 7970 | 0.8% |
| 7 | 7866 | 0.8% |
| 8 | 14613 | |
| 9 | 7380 | 0.7% |
| 10 | 13886 |
| Value | Count | Frequency (%) |
| 99 | 14095 | |
| 98 | 13717 | |
| 97 | 15532 | |
| 96 | 9868 | |
| 95 | 9767 | |
| 94 | 13500 | |
| 93 | 9696 | |
| 92 | 6617 | |
| 91 | 16167 | |
| 90 | 13880 |
team
Categorical
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.0 MiB |
| football | 45338 |
|---|---|
| BUF | 36311 |
| MIA | 36311 |
| ATL | 35640 |
| TB | 35640 |
| Other values (28) |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 2.984980446 |
| Min length | 2 |
Characters and Unicode
| Total characters | 3112660 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | NYG |
|---|---|
| 2nd row | NYG |
| 3rd row | NYG |
| 4th row | NYG |
| 5th row | NYG |
Common Values
| Value | Count | Frequency (%) |
| football | 45338 | 4.3% |
| BUF | 36311 | 3.5% |
| MIA | 36311 | 3.5% |
| ATL | 35640 | 3.4% |
| TB | 35640 | 3.4% |
| TEN | 35288 | 3.4% |
| SEA | 35288 | 3.4% |
| NYG | 35277 | 3.4% |
| WAS | 35277 | 3.4% |
| DEN | 34122 | 3.3% |
| Other values (23) | 678282 |
Length
| Value | Count | Frequency (%) |
| football | 45338 | 4.3% |
| buf | 36311 | 3.5% |
| mia | 36311 | 3.5% |
| atl | 35640 | 3.4% |
| tb | 35640 | 3.4% |
| ten | 35288 | 3.4% |
| sea | 35288 | 3.4% |
| nyg | 35277 | 3.4% |
| was | 35277 | 3.4% |
| den | 34122 | 3.3% |
| Other values (23) | 678282 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 360184 | 11.6% |
| N | 290158 | 9.3% |
| I | 240526 | 7.7% |
| L | 215622 | 6.9% |
| E | 190267 | 6.1% |
| C | 174394 | 5.6% |
| T | 166859 | 5.4% |
| D | 128205 | 4.1% |
| B | 126995 | 4.1% |
| S | 95172 | 3.1% |
| Other values (20) | 1124278 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2749956 | |
| Lowercase Letter | 362704 | 11.7% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 360184 | |
| N | 290158 | 10.6% |
| I | 240526 | 8.7% |
| L | 215622 | 7.8% |
| E | 190267 | 6.9% |
| C | 174394 | 6.3% |
| T | 166859 | 6.1% |
| D | 128205 | 4.7% |
| B | 126995 | 4.6% |
| S | 95172 | 3.5% |
| Other values (14) | 761574 |
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 90676 | |
| o | 90676 | |
| f | 45338 | |
| a | 45338 | |
| b | 45338 | |
| t | 45338 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3112660 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 360184 | 11.6% |
| N | 290158 | 9.3% |
| I | 240526 | 7.7% |
| L | 215622 | 6.9% |
| E | 190267 | 6.1% |
| C | 174394 | 5.6% |
| T | 166859 | 5.4% |
| D | 128205 | 4.1% |
| B | 126995 | 4.1% |
| S | 95172 | 3.1% |
| Other values (20) | 1124278 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3112660 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 360184 | 11.6% |
| N | 290158 | 9.3% |
| I | 240526 | 7.7% |
| L | 215622 | 6.9% |
| E | 190267 | 6.1% |
| C | 174394 | 5.6% |
| T | 166859 | 5.4% |
| D | 128205 | 4.1% |
| B | 126995 | 4.1% |
| S | 95172 | 3.1% |
| Other values (20) | 1124278 |
playDirection
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.0 MiB |
| right | |
|---|---|
| left |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 4.502051259 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4694622 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | right |
|---|---|
| 2nd row | right |
| 3rd row | right |
| 4th row | right |
| 5th row | right |
Common Values
| Value | Count | Frequency (%) |
| right | 523526 | |
| left | 519248 |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| right | 523526 | |
| left | 519248 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 1042774 | |
| r | 523526 | |
| i | 523526 | |
| g | 523526 | |
| h | 523526 | |
| l | 519248 | |
| e | 519248 | |
| f | 519248 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4694622 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 1042774 | |
| r | 523526 | |
| i | 523526 | |
| g | 523526 | |
| h | 523526 | |
| l | 519248 | |
| e | 519248 | |
| f | 519248 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4694622 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 1042774 | |
| r | 523526 | |
| i | 523526 | |
| g | 523526 | |
| h | 523526 | |
| l | 519248 | |
| e | 519248 | |
| f | 519248 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4694622 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 1042774 | |
| r | 523526 | |
| i | 523526 | |
| g | 523526 | |
| h | 523526 | |
| l | 519248 | |
| e | 519248 | |
| f | 519248 |
x
Real number (ℝ≥0)
| Distinct | 11548 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.5512206 |
| Minimum | 1.23 |
|---|---|
| Maximum | 120.01 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 1.23 |
|---|---|
| 5-th percentile | 21.45 |
| Q1 | 41.03 |
| median | 58.96 |
| Q3 | 78.04 |
| 95-th percentile | 98.42 |
| Maximum | 120.01 |
| Range | 118.78 |
| Interquartile range (IQR) | 37.01 |
Descriptive statistics
| Standard deviation | 23.76694891 |
|---|---|
| Coefficient of variation (CV) | 0.3991009533 |
| Kurtosis | -0.8100265111 |
| Mean | 59.5512206 |
| Median Absolute Deviation (MAD) | 18.48 |
| Skewness | 0.04421791045 |
| Sum | 62098464.51 |
| Variance | 564.8678606 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 57.5 | 233 | < 0.1% |
| 69.14 | 233 | < 0.1% |
| 69.58 | 226 | < 0.1% |
| 70.67 | 225 | < 0.1% |
| 55.34 | 224 | < 0.1% |
| 69.2 | 215 | < 0.1% |
| 69.08 | 215 | < 0.1% |
| 56.76 | 212 | < 0.1% |
| 57.45 | 205 | < 0.1% |
| 57.84 | 204 | < 0.1% |
| Other values (11538) | 1040582 |
| Value | Count | Frequency (%) |
| 1.23 | 1 | |
| 1.39 | 1 | |
| 1.47 | 1 | |
| 1.56 | 1 | |
| 1.66 | 1 | |
| 1.7 | 1 | |
| 1.73 | 1 | |
| 1.82 | 1 | |
| 1.92 | 1 | |
| 1.95 | 1 |
| Value | Count | Frequency (%) |
| 120.01 | 1 | |
| 120 | 2 | |
| 119.97 | 2 | |
| 119.91 | 1 | |
| 119.9 | 1 | |
| 119.81 | 1 | |
| 119.7 | 1 | |
| 119.56 | 1 | |
| 119.49 | 1 | |
| 119.48 | 1 |
y
Real number (ℝ)
| Distinct | 5347 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.72224564 |
| Minimum | -3.42 |
|---|---|
| Maximum | 53.28 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 79 |
| Negative (%) | < 0.1% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | -3.42 |
|---|---|
| 5-th percentile | 11.63 |
| Q1 | 21.98 |
| median | 26.76 |
| Q3 | 31.55 |
| 95-th percentile | 41.53 |
| Maximum | 53.28 |
| Range | 56.7 |
| Interquartile range (IQR) | 9.57 |
Descriptive statistics
| Standard deviation | 8.249658559 |
|---|---|
| Coefficient of variation (CV) | 0.3087187607 |
| Kurtosis | 0.3276084201 |
| Mean | 26.72224564 |
| Median Absolute Deviation (MAD) | 4.78 |
| Skewness | -0.03129551577 |
| Sum | 27865262.98 |
| Variance | 68.05686635 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23.69 | 1025 | 0.1% |
| 23.75 | 1012 | 0.1% |
| 23.73 | 998 | 0.1% |
| 23.85 | 990 | 0.1% |
| 29.79 | 981 | 0.1% |
| 23.67 | 979 | 0.1% |
| 23.78 | 979 | 0.1% |
| 29.61 | 977 | 0.1% |
| 23.83 | 977 | 0.1% |
| 29.76 | 972 | 0.1% |
| Other values (5337) | 1032884 |
| Value | Count | Frequency (%) |
| -3.42 | 1 | |
| -3.15 | 1 | |
| -3.05 | 1 | |
| -3.01 | 1 | |
| -2.97 | 1 | |
| -2.93 | 1 | |
| -2.89 | 1 | |
| -2.88 | 1 | |
| -2.83 | 1 | |
| -2.79 | 1 |
| Value | Count | Frequency (%) |
| 53.28 | 1 | |
| 53.21 | 1 | |
| 53.19 | 1 | |
| 53.18 | 2 | |
| 53.17 | 1 | |
| 53.16 | 2 | |
| 53.15 | 1 | |
| 53.13 | 1 | |
| 53.11 | 1 | |
| 53.08 | 2 |
s
Real number (ℝ≥0)
| Distinct | 2136 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.591271532 |
| Minimum | 0 |
|---|---|
| Maximum | 27.93 |
| Zeros | 64324 |
| Zeros (%) | 6.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.77 |
| median | 2.14 |
| Q3 | 3.83 |
| 95-th percentile | 6.78 |
| Maximum | 27.93 |
| Range | 27.93 |
| Interquartile range (IQR) | 3.06 |
Descriptive statistics
| Standard deviation | 2.388592779 |
|---|---|
| Coefficient of variation (CV) | 0.9217840546 |
| Kurtosis | 14.59087122 |
| Mean | 2.591271532 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 2.36346026 |
| Sum | 2702110.58 |
| Variance | 5.705375464 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 64324 | 6.2% |
| 0.01 | 14914 | 1.4% |
| 0.02 | 8612 | 0.8% |
| 0.03 | 6627 | 0.6% |
| 0.04 | 5641 | 0.5% |
| 0.05 | 5183 | 0.5% |
| 0.06 | 4652 | 0.4% |
| 0.07 | 4259 | 0.4% |
| 0.09 | 3915 | 0.4% |
| 0.08 | 3893 | 0.4% |
| Other values (2126) | 920754 |
| Value | Count | Frequency (%) |
| 0 | 64324 | |
| 0.01 | 14914 | 1.4% |
| 0.02 | 8612 | 0.8% |
| 0.03 | 6627 | 0.6% |
| 0.04 | 5641 | 0.5% |
| 0.05 | 5183 | 0.5% |
| 0.06 | 4652 | 0.4% |
| 0.07 | 4259 | 0.4% |
| 0.08 | 3893 | 0.4% |
| 0.09 | 3915 | 0.4% |
| Value | Count | Frequency (%) |
| 27.93 | 1 | |
| 27.74 | 1 | |
| 27.62 | 1 | |
| 27.55 | 1 | |
| 27.5 | 2 | |
| 27.42 | 1 | |
| 27.37 | 1 | |
| 27.32 | 1 | |
| 27.25 | 1 | |
| 27.15 | 1 |
a
Real number (ℝ≥0)
| Distinct | 1552 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.790390353 |
| Minimum | 0 |
|---|---|
| Maximum | 33.56 |
| Zeros | 60212 |
| Zeros (%) | 5.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.71 |
| median | 1.53 |
| Q3 | 2.58 |
| 95-th percentile | 4.47 |
| Maximum | 33.56 |
| Range | 33.56 |
| Interquartile range (IQR) | 1.87 |
Descriptive statistics
| Standard deviation | 1.444250141 |
|---|---|
| Coefficient of variation (CV) | 0.8066677405 |
| Kurtosis | 7.673518942 |
| Mean | 1.790390353 |
| Median Absolute Deviation (MAD) | 0.91 |
| Skewness | 1.520473414 |
| Sum | 1866972.51 |
| Variance | 2.085858469 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 60212 | 5.8% |
| 0.01 | 11568 | 1.1% |
| 0.02 | 6531 | 0.6% |
| 0.03 | 5014 | 0.5% |
| 0.04 | 4188 | 0.4% |
| 0.05 | 3731 | 0.4% |
| 1.13 | 3387 | 0.3% |
| 1.15 | 3342 | 0.3% |
| 1.03 | 3319 | 0.3% |
| 0.91 | 3313 | 0.3% |
| Other values (1542) | 938169 |
| Value | Count | Frequency (%) |
| 0 | 60212 | |
| 0.01 | 11568 | 1.1% |
| 0.02 | 6531 | 0.6% |
| 0.03 | 5014 | 0.5% |
| 0.04 | 4188 | 0.4% |
| 0.05 | 3731 | 0.4% |
| 0.06 | 3262 | 0.3% |
| 0.07 | 2886 | 0.3% |
| 0.08 | 2696 | 0.3% |
| 0.09 | 2623 | 0.3% |
| Value | Count | Frequency (%) |
| 33.56 | 1 | |
| 31.23 | 1 | |
| 31.02 | 1 | |
| 29.78 | 1 | |
| 28.66 | 1 | |
| 28.46 | 1 | |
| 27.9 | 1 | |
| 27.75 | 1 | |
| 27.29 | 1 | |
| 27.26 | 1 |
dis
Real number (ℝ≥0)
| Distinct | 554 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2624645513 |
| Minimum | 0 |
|---|---|
| Maximum | 7.44 |
| Zeros | 64432 |
| Zeros (%) | 6.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.08 |
| median | 0.22 |
| Q3 | 0.38 |
| 95-th percentile | 0.68 |
| Maximum | 7.44 |
| Range | 7.44 |
| Interquartile range (IQR) | 0.3 |
Descriptive statistics
| Standard deviation | 0.2568259785 |
|---|---|
| Coefficient of variation (CV) | 0.9785168215 |
| Kurtosis | 54.92772656 |
| Mean | 0.2624645513 |
| Median Absolute Deviation (MAD) | 0.15 |
| Skewness | 4.396544745 |
| Sum | 273691.21 |
| Variance | 0.06595958322 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 64432 | 6.2% |
| 0.01 | 56248 | 5.4% |
| 0.02 | 31973 | 3.1% |
| 0.03 | 24474 | 2.3% |
| 0.04 | 21424 | 2.1% |
| 0.05 | 19764 | 1.9% |
| 0.17 | 19349 | 1.9% |
| 0.19 | 19343 | 1.9% |
| 0.18 | 19309 | 1.9% |
| 0.21 | 19217 | 1.8% |
| Other values (544) | 747241 |
| Value | Count | Frequency (%) |
| 0 | 64432 | |
| 0.01 | 56248 | |
| 0.02 | 31973 | |
| 0.03 | 24474 | 2.3% |
| 0.04 | 21424 | 2.1% |
| 0.05 | 19764 | 1.9% |
| 0.06 | 18963 | 1.8% |
| 0.07 | 18404 | 1.8% |
| 0.08 | 18260 | 1.8% |
| 0.09 | 18107 | 1.7% |
| Value | Count | Frequency (%) |
| 7.44 | 1 | |
| 7.37 | 1 | |
| 7.05 | 1 | |
| 6.99 | 1 | |
| 6.85 | 1 | |
| 6.81 | 1 | |
| 6.79 | 1 | |
| 6.73 | 1 | |
| 6.58 | 1 | |
| 6.57 | 1 |
o
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 45338 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 181.3481154 |
| Minimum | 0 |
|---|---|
| Maximum | 360 |
| Zeros | 11 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 33.23 |
| Q1 | 91.77 |
| median | 179.71 |
| Q3 | 270.61 |
| 95-th percentile | 329.78 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 178.84 |
Descriptive statistics
| Standard deviation | 98.50140569 |
|---|---|
| Coefficient of variation (CV) | 0.5431620034 |
| Kurtosis | -1.355062494 |
| Mean | 181.3481154 |
| Median Absolute Deviation (MAD) | 89.38 |
| Skewness | 0.003239643568 |
| Sum | 180883138.9 |
| Variance | 9702.526922 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 1927 | 0.2% |
| 280.72 | 98 | < 0.1% |
| 264.03 | 97 | < 0.1% |
| 273.73 | 96 | < 0.1% |
| 267.19 | 96 | < 0.1% |
| 90.76 | 95 | < 0.1% |
| 87.18 | 95 | < 0.1% |
| 92.19 | 94 | < 0.1% |
| 93.91 | 94 | < 0.1% |
| 268.31 | 93 | < 0.1% |
| Other values (35991) | 994651 | |
| (Missing) | 45338 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 11 | |
| 0.01 | 11 | |
| 0.02 | 10 | |
| 0.03 | 22 | |
| 0.04 | 14 | |
| 0.05 | 13 | |
| 0.06 | 20 | |
| 0.07 | 21 | |
| 0.08 | 17 | |
| 0.09 | 17 |
| Value | Count | Frequency (%) |
| 360 | 10 | |
| 359.99 | 10 | |
| 359.98 | 12 | |
| 359.97 | 21 | |
| 359.96 | 8 | < 0.1% |
| 359.95 | 9 | |
| 359.94 | 20 | |
| 359.93 | 9 | |
| 359.92 | 10 | |
| 359.91 | 17 |
dir
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 45338 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.8773805 |
| Minimum | 0 |
|---|---|
| Maximum | 360 |
| Zeros | 22 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 24.5 |
| Q1 | 92.09 |
| median | 180.54 |
| Q3 | 270.38 |
| 95-th percentile | 335.49 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 178.29 |
Descriptive statistics
| Standard deviation | 100.2772581 |
|---|---|
| Coefficient of variation (CV) | 0.5543935776 |
| Kurtosis | -1.275577476 |
| Mean | 180.8773805 |
| Median Absolute Deviation (MAD) | 89.16 |
| Skewness | -0.01011826178 |
| Sum | 180413610.9 |
| Variance | 10055.52849 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 265.92 | 72 | < 0.1% |
| 271.05 | 72 | < 0.1% |
| 270.72 | 70 | < 0.1% |
| 270.69 | 68 | < 0.1% |
| 85.88 | 68 | < 0.1% |
| 270.4 | 67 | < 0.1% |
| 266.38 | 66 | < 0.1% |
| 93.03 | 66 | < 0.1% |
| 270.78 | 66 | < 0.1% |
| 274.45 | 66 | < 0.1% |
| Other values (35991) | 996755 | |
| (Missing) | 45338 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 22 | |
| 0.01 | 20 | |
| 0.02 | 20 | |
| 0.03 | 19 | |
| 0.04 | 24 | |
| 0.05 | 17 | |
| 0.06 | 21 | |
| 0.07 | 20 | |
| 0.08 | 26 | |
| 0.09 | 14 |
| Value | Count | Frequency (%) |
| 360 | 16 | |
| 359.99 | 20 | |
| 359.98 | 25 | |
| 359.97 | 20 | |
| 359.96 | 27 | |
| 359.95 | 27 | |
| 359.94 | 25 | |
| 359.93 | 33 | |
| 359.92 | 16 | |
| 359.91 | 20 |
event
Categorical
| Distinct | 23 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.0 MiB |
| None | |
|---|---|
| ball_snap | 24426 |
| pass_forward | 21390 |
| autoevent_passforward | 10856 |
| autoevent_ballsnap | 9982 |
| Other values (18) | 13409 |
Length
| Max length | 25 |
|---|---|
| Median length | 4 |
| Mean length | 4.676231858 |
| Min length | 3 |
Characters and Unicode
| Total characters | 4876253 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | None |
|---|---|
| 2nd row | None |
| 3rd row | None |
| 4th row | None |
| 5th row | None |
Common Values
| Value | Count | Frequency (%) |
| None | 962711 | |
| ball_snap | 24426 | 2.3% |
| pass_forward | 21390 | 2.1% |
| autoevent_passforward | 10856 | 1.0% |
| autoevent_ballsnap | 9982 | 1.0% |
| play_action | 6118 | 0.6% |
| run | 1518 | 0.1% |
| qb_sack | 1357 | 0.1% |
| pass_arrived | 1357 | 0.1% |
| man_in_motion | 621 | 0.1% |
| Other values (13) | 2438 | 0.2% |
Length
| Value | Count | Frequency (%) |
| none | 962711 | |
| ball_snap | 24426 | 2.3% |
| pass_forward | 21390 | 2.1% |
| autoevent_passforward | 10856 | 1.0% |
| autoevent_ballsnap | 9982 | 1.0% |
| play_action | 6118 | 0.6% |
| run | 1518 | 0.1% |
| qb_sack | 1357 | 0.1% |
| pass_arrived | 1357 | 0.1% |
| man_in_motion | 621 | 0.1% |
| Other values (13) | 2438 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1029549 | |
| o | 1024696 | |
| e | 1010045 | |
| N | 962711 | |
| a | 173512 | 3.6% |
| s | 106628 | 2.2% |
| _ | 79212 | 1.6% |
| p | 76843 | 1.6% |
| l | 75555 | 1.5% |
| r | 70426 | 1.4% |
| Other values (15) | 267076 | 5.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3834330 | |
| Uppercase Letter | 962711 | 19.7% |
| Connector Punctuation | 79212 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 1029549 | |
| o | 1024696 | |
| e | 1010045 | |
| a | 173512 | 4.5% |
| s | 106628 | 2.8% |
| p | 76843 | 2.0% |
| l | 75555 | 2.0% |
| r | 70426 | 1.8% |
| t | 53153 | 1.4% |
| b | 36041 | 0.9% |
| Other values (13) | 177882 | 4.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 962711 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 79212 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4797041 | |
| Common | 79212 | 1.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 1029549 | |
| o | 1024696 | |
| e | 1010045 | |
| N | 962711 | |
| a | 173512 | 3.6% |
| s | 106628 | 2.2% |
| p | 76843 | 1.6% |
| l | 75555 | 1.6% |
| r | 70426 | 1.5% |
| t | 53153 | 1.1% |
| Other values (14) | 213923 | 4.5% |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| _ | 79212 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 4876253 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| n | 1029549 | 21.1% |
| o | 1024696 | 21.0% |
| e | 1010045 | 20.7% |
| N | 962711 | 19.7% |
| a | 173512 | 3.6% |
| s | 106628 | 2.2% |
| _ | 79212 | 1.6% |
| p | 76843 | 1.6% |
| l | 75555 | 1.5% |
| r | 70426 | 1.4% |
| Other values (15) | 267076 | 5.5% |
Auto
The auto setting is an easily interpretable pairwise column metric that maps each pair of variable types to a method: categorical-categorical: Cramér's V; numerical-categorical: Cramér's V (using a discretized numerical column); numerical-numerical: Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
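The rank-based calculation described above can be sketched in plain Python. This is an illustration under stated assumptions, not the code pandas-profiling runs: the helper names `average_ranks`, `pearson_r` and `spearman_rho` are made up for this example, tied values receive the mean of their rank positions, and the `s` and `dis` values are copied from the first ten rows of the sample under First rows.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# s (speed) and dis (distance) from the first ten sample rows.
s = [0.93, 1.07, 1.21, 1.32, 1.49, 1.78, 1.94, 2.17, 2.38, 2.62]
dis = [0.09, 0.11, 0.12, 0.13, 0.14, 0.17, 0.18, 0.21, 0.23, 0.26]
rho_sample = spearman_rho(s, dis)

# A monotonic but nonlinear relationship: rho is 1 (up to rounding), r is lower.
x = [1, 2, 3, 4, 5]
cubes = [v ** 3 for v in x]
rho_cubes = spearman_rho(x, cubes)
r_cubes = pearson_r(x, cubes)
print(rho_sample, rho_cubes, r_cubes)
```

Both columns rise together in every one of the ten rows, so their ranks agree exactly. This is consistent with the report's warning that s is highly correlated with dis.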
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
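A minimal sketch of this formula and of the invariance property, assuming population (divide-by-n) covariance and standard deviations; the divisor cancels in the ratio. The function name `pearson_r` is illustrative, and the `s` and `dis` values are copied from the first ten rows of the sample under First rows.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# s (speed) and dis (distance) from the first ten sample rows.
s = [0.93, 1.07, 1.21, 1.32, 1.49, 1.78, 1.94, 2.17, 2.38, 2.62]
dis = [0.09, 0.11, 0.12, 0.13, 0.14, 0.17, 0.18, 0.21, 0.23, 0.26]

r = pearson_r(s, dis)

# Shift and rescale each variable separately (positive scale factors):
# r should be unchanged up to floating-point rounding.
r_transformed = pearson_r([3 * v + 7 for v in s], [0.5 * v - 2 for v in dis])
print(round(r, 4), round(r_transformed, 4))
```

Note that a negative scale factor on exactly one of the variables would flip the sign of r, so the invariance holds for positive rescaling.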
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
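The pair-counting definition above can be written out directly. This sketch implements exactly that formula (often called tau-a); library implementations commonly apply an additional correction for ties, so results may differ when tied values are present. The function name `kendall_tau_a` is illustrative, and the `s` and `a` values are copied from the first ten rows of the sample under First rows.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted in neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# s (speed) and a (acceleration) from the first ten sample rows.
s = [0.93, 1.07, 1.21, 1.32, 1.49, 1.78, 1.94, 2.17, 2.38, 2.62]
a = [0.83, 1.05, 1.11, 1.14, 1.42, 1.67, 1.49, 1.50, 1.39, 1.43]

tau = kendall_tau_a(s, a)
print(tau)
```

For these ten rows there are 45 pairs, of which 36 are concordant and 9 discordant, giving τ = 27/45 = 0.6.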
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation for Phik is available.
First rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021091600 | 65 | 40031.0 | 1 | 2021-09-17T00:23:09.600 | 23.0 | NYG | right | 46.32 | 22.36 | 0.93 | 0.83 | 0.09 | 271.71 | 79.32 | None |
| 1 | 2021091600 | 65 | 40031.0 | 2 | 2021-09-17T00:23:09.700 | 23.0 | NYG | right | 46.43 | 22.39 | 1.07 | 1.05 | 0.11 | 275.92 | 72.33 | None |
| 2 | 2021091600 | 65 | 40031.0 | 3 | 2021-09-17T00:23:09.800 | 23.0 | NYG | right | 46.54 | 22.44 | 1.21 | 1.11 | 0.12 | 278.85 | 67.03 | None |
| 3 | 2021091600 | 65 | 40031.0 | 4 | 2021-09-17T00:23:09.900 | 23.0 | NYG | right | 46.65 | 22.49 | 1.32 | 1.14 | 0.13 | 282.45 | 62.63 | None |
| 4 | 2021091600 | 65 | 40031.0 | 5 | 2021-09-17T00:23:10.000 | 23.0 | NYG | right | 46.77 | 22.56 | 1.49 | 1.42 | 0.14 | 285.54 | 59.26 | None |
| 5 | 2021091600 | 65 | 40031.0 | 6 | 2021-09-17T00:23:10.100 | 23.0 | NYG | right | 46.91 | 22.65 | 1.78 | 1.67 | 0.17 | 292.08 | 56.87 | ball_snap |
| 6 | 2021091600 | 65 | 40031.0 | 7 | 2021-09-17T00:23:10.200 | 23.0 | NYG | right | 47.07 | 22.75 | 1.94 | 1.49 | 0.18 | 298.71 | 54.79 | None |
| 7 | 2021091600 | 65 | 40031.0 | 8 | 2021-09-17T00:23:10.300 | 23.0 | NYG | right | 47.23 | 22.87 | 2.17 | 1.50 | 0.21 | 303.41 | 53.02 | None |
| 8 | 2021091600 | 65 | 40031.0 | 9 | 2021-09-17T00:23:10.400 | 23.0 | NYG | right | 47.42 | 23.02 | 2.38 | 1.39 | 0.23 | 305.32 | 51.86 | None |
| 9 | 2021091600 | 65 | 40031.0 | 10 | 2021-09-17T00:23:10.500 | 23.0 | NYG | right | 47.62 | 23.18 | 2.62 | 1.43 | 0.26 | 306.31 | 50.09 | None |
Last rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1042764 | 2021092000 | 3759 | NaN | 47 | 2021-09-21T03:07:54.400 | NaN | football | right | 67.81 | 30.72 | 5.11 | 6.08 | 0.45 | NaN | NaN | None |
| 1042765 | 2021092000 | 3759 | NaN | 48 | 2021-09-21T03:07:54.500 | NaN | football | right | 68.04 | 31.19 | 5.48 | 6.13 | 0.53 | NaN | NaN | None |
| 1042766 | 2021092000 | 3759 | NaN | 49 | 2021-09-21T03:07:54.600 | NaN | football | right | 68.34 | 31.67 | 5.83 | 5.38 | 0.57 | NaN | NaN | None |
| 1042767 | 2021092000 | 3759 | NaN | 50 | 2021-09-21T03:07:54.700 | NaN | football | right | 68.69 | 32.16 | 6.15 | 4.56 | 0.60 | NaN | NaN | None |
| 1042768 | 2021092000 | 3759 | NaN | 51 | 2021-09-21T03:07:54.800 | NaN | football | right | 69.08 | 32.65 | 6.42 | 4.09 | 0.63 | NaN | NaN | pass_forward |
| 1042769 | 2021092000 | 3759 | NaN | 52 | 2021-09-21T03:07:54.900 | NaN | football | right | 69.52 | 33.13 | 6.62 | 3.07 | 0.65 | NaN | NaN | autoevent_passforward |
| 1042770 | 2021092000 | 3759 | NaN | 53 | 2021-09-21T03:07:55.000 | NaN | football | right | 70.09 | 33.56 | 7.12 | 2.61 | 0.71 | NaN | NaN | None |
| 1042771 | 2021092000 | 3759 | NaN | 54 | 2021-09-21T03:07:55.100 | NaN | football | right | 75.23 | 35.47 | 22.95 | 0.86 | 5.48 | NaN | NaN | None |
| 1042772 | 2021092000 | 3759 | NaN | 55 | 2021-09-21T03:07:55.200 | NaN | football | right | 77.35 | 36.43 | 22.87 | 1.26 | 2.33 | NaN | NaN | None |
| 1042773 | 2021092000 | 3759 | NaN | 56 | 2021-09-21T03:07:55.300 | NaN | football | right | 79.42 | 37.39 | 22.79 | 1.59 | 2.29 | NaN | NaN | None |